Improving the performance of a FastStats Data Load

Introduction

This document gives some guidance on the steps that can be taken to improve the speed of a FastStats data load.

Analyzing the Progress Log

The first thing that should be considered when trying to improve the performance of a FastStats Data Load is the progress log. A FastStats load consists of a number of separate stages: Pre-Load Actions, Auto Discovery, Load / Join and Post-load actions. Of these the most expensive operation is likely to be the Load / Join. The AutoDiscovery stage simply reads through the input data files and should be comparatively fast. Pre- and Post-Load actions are typically fast, but this depends entirely on the actions defined.

The Load / Join will be divided up into two sections. The first section will be 'Sort' and the second section 'Build'.

Log messages are marked with the current time so it is easy to see how long a process has taken. By concentrating on speeding up the slowest processes the greatest gains can be made – there is no point trying to optimize a process that takes a relatively short amount of time.

In in the 'log' directory a file called 'Build Statistics Log.txt' is created and appended to for every successful build. This gives a long term log of how the system is growing over time (both in terms of data volume but also in terms of build time).

Hardware/Operating System

Disk

A FastStats data load is a disk intensive process. For this reason it is important to ensure that your data load hardware has the fastest possible disks. Data load performance can be maximised by making sure:

Your disk has good read/write access speed. This can have a big effect on the speed of the load.
You are building on a local disk (not over a network)
You are using local paths like ‘C:\’ instead of UNC network paths even if the disk is local to the machine
Anti-virus settings have taken FastStats Designer into account - see here

Processor

A FastStats data load is a CPU intensive process. Parts of the build process use multiple threads to improve performance (Extraction, Auto-discovery, Sorting) and will benefit from multiple cores. However a significant proportion of the data load is still single-threaded and will not make use of multiple cores. To maximise the overall speed of the data load Apteco recommends choosing processors that maximise single-thread performance. One source of single-thread benchmark results for a variety of CPUs is available here.

FastStats Designer has a 'Benchmark' feature that will give a score for the single thread performance of you CPU.

Benchmark Score	Advice
<10000	Your CPU is under performing and FastStats processing on this machine may be slower than expected.
10000-20000	Your CPU performance is adequate. FastStats queries should run quickly but if you have a large system you would benefit from a faster CPU.
20000-30000	Your CPU performance is good.
>30000	Your CPU performance is excellent.

The highest benchmark score we have seen so far (Nov 2022) is ~46,000 from a 12th Gen Intel Core i9-12900

Power Policy

One potential reason for low CPU single thread performance is that the machine is configured with a restrictive power policy in order to save power. This is often the case with newly commissioned servers that may be set a 'Balanced' power policy.

You should check your BIOS settings and Operating System Power settings to ensure that your CPU is allowed to run at maximum performance. We recommend that BIOS power settings should be set to "OS Control" so that the CPU power policy is delegated to the Operating System which is better able to judge the machine workload than the BIOS.

Data Table Extraction

Extracting data from a database is dependent on the following factors:

Database performance, including the performance of the database machine and the current level of load the machine is experiencing.
Network bandwidth to the database machine
Local processor speed
Local disk speed

It possible to limit the number of tables that are extracted simultaneously in order to reduce the workload on your source database. If this feature is used then it is possible to control the order of extraction by altering the 'Priority' value on the Extract Options of each data source. Extracts that take longer should be given a higher priority so they start sooner.

Auto-Discovery

The Auto-Discovery process is used to automatically determine:

Data types for variables that have not been manually defined
Decodes from the input data
Key types and Key sizes

It is possible to skip the auto discovery process for large tables as long as these parameters have been defined in advance. Right on the table heading in the table relationship diagram and uncheck the 'Auto Discovery' option to skip auto discovery for a particular table. Define the key type and key size on the table relationship diagram on the 'Properties' dialog of each table.

Single Parent Record Lookups

Large lookup tables can have a big impact on data load speed. Lookups by default are joined using an index instead of a sort. In some cases it is possible to define a lookup as a SPL (Single Parent-record Lookup), SPL's are joined using a sort which can be much faster and can cope with larger lookup tables.

Normal Lookups - Rows are retrieve multiple times, used in the case of a few rows being copied and duplicated onto many parent record rows. E.g. Flattening the example 'Destination Details' table onto the Holiday table.

Single Parent Record Lookups - Rows are only retrieved once, used in the case of extra columns being appended to a parent table. E.g. If we had a table called 'People Details' that extra information about some of our customers in our example People table we could mark this table as a SPL to improve load performance. More details on SPL's here.

Multi Parent Record Lookups

Multi-Parent Lookup tables that are wide (lots of columns) and join to a parent table containing a large number of rows may benefit from being marked as 'Join at Query Time' rather than the default of 'Join at Build Time'. This can result in improved load performance, a reduction in disk space and faster query time at the expense of increased memory use during while running the query.